feat: reduce np.ndarray #1462

cmp0xff · 2025-10-30T22:03:04Z

Addresses Strict report as of 3/13/2025 #1171: change references to npt.NDArray to be npt.NDArray[Any]

MarcoGorelli · 2025-10-31T10:20:17Z

pandas-stubs/core/indexers/objects.pyi

-    ) -> tuple[np.ndarray, np.ndarray]: ...
+    ) -> tuple[npt.NDArray[Any], npt.NDArray[Any]]: ...


Is this a chance to make this more precise, like

tuple[np_1darray[np.intp], np_1darray[np.intp]]

?

I have no insight into this class or this method, and it is not documented in any detail (personally, I don't even know what "window bounds" means). It would be great if you could suggest a few tests; otherwise, can we leave it for another PR?

Sure, happy to leave it until later!

I prefer the change suggested by @MarcoGorelli. See the docs here: https://pandas.pydata.org/docs/dev/user_guide/window.html#custom-window-rolling

The arrays have to be np_ndarray_int64
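
For readers unsure what "window bounds" means, here is a rough numpy-only sketch. It assumes the convention described in the linked user guide, where window i covers positions start[i] up to (but not including) end[i]. The function name forward_window_bounds is hypothetical and is not part of pandas; it only illustrates why a pair of int64 arrays is the natural return type.

```python
import numpy as np
import numpy.typing as npt


def forward_window_bounds(
    num_values: int, window_size: int
) -> tuple[npt.NDArray[np.int64], npt.NDArray[np.int64]]:
    """Hypothetical helper: bounds of forward-looking windows.

    Window i covers positions start[i] (inclusive) to end[i] (exclusive).
    """
    # One window per input position, starting at that position.
    start = np.arange(num_values, dtype=np.int64)
    # Each window extends window_size positions, clipped at the end of the data.
    end = np.minimum(start + window_size, num_values).astype(np.int64)
    return start, end


start, end = forward_window_bounds(num_values=5, window_size=2)
```

Both arrays have one entry per row and dtype int64, which is what the more precise annotation would express.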

MarcoGorelli · 2025-10-31T10:22:39Z

pandas-stubs/core/indexes/category.pyi


 class CategoricalIndex(ExtensionIndex[S1], accessor.PandasDelegate):
-    codes: np.ndarray = ...
+    codes: npt.NDArray[Any] = ...


np_1darray[np.intp]?

cmp0xff

/pandas_nightly

cmp0xff · 2025-10-31T10:31:06Z

pandas-stubs/core/indexers/objects.pyi

-    ) -> tuple[np.ndarray, np.ndarray]: ...
+    ) -> tuple[npt.NDArray[Any], npt.NDArray[Any]]: ...


I have no insight into this class or this method, and it is not documented in any detail (personally, I don't even know what "window bounds" means). It would be great if you could suggest a few tests; otherwise, can we leave it for another PR?

cmp0xff · 2025-10-31T10:41:07Z

pandas-stubs/core/indexes/category.pyi


 class CategoricalIndex(ExtensionIndex[S1], accessor.PandasDelegate):
-    codes: np.ndarray = ...
+    codes: npt.NDArray[Any] = ...


MarcoGorelli · 2025-10-31T11:01:09Z

pandas-stubs/_libs/interval.pyi

-    @property
-    def is_monotonic_increasing(self) -> bool: ...
-    def clear_mapping(self) -> None: ...
+class IntervalTree(IntervalMixin): ...


if the above comment is right, this can probably just be removed?

class IntervalTree is implemented in pandas with Cython. I don't understand what's happening in the Cython code, but I would like to keep the class in the stubs, even though it's empty. Please tell me if you have a good argument for removing it completely.

It's not a documented class, so we should remove it from the stubs. And then change the references in _MidDescriptor and _LengthDescriptor to be IntervalMixin

MarcoGorelli · 2025-10-31T11:05:29Z

pandas-stubs/core/base.pyi

    | np_ndarray_float
    | np_ndarray_complex
-    | dict[str, np.ndarray]
+    | dict[str, np_1darray[Any]]


the general pattern you seem to be following is:

for function arguments, don't be restrictive on shape

for return types, be as precise as possible

Given that NumListLike is often used as argument, would this one be better as Any shape?

for function arguments, don't be restrictive on shape

This is relevant until support for Python 3.10 ends, because Python 3.10 can only use an old version of numpy that does not infer the correct shape on construction: with that version, np.array([1, 2, 3]) is not typed as a 1-d array by static type checkers.

would this one be better as Any shape?

Yep dc85974
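
The pattern discussed in this thread (any shape for arguments, a precise shape for return types) can be sketched as follows. Array1D is a hypothetical alias standing in for the stubs' 1-d array type, and as_flat is an illustrative function, not something from pandas or pandas-stubs.

```python
from typing import Any

import numpy as np
import numpy.typing as npt

# Hypothetical alias standing in for a shape-aware 1-D array type.
Array1D = np.ndarray[tuple[int], np.dtype[Any]]


def as_flat(values: npt.NDArray[Any]) -> Array1D:
    """Accept an array of any shape; promise a 1-D array back."""
    # reshape(-1) always yields exactly one dimension at runtime.
    return values.reshape(-1)


matrix = np.arange(6).reshape(2, 3)
flat = as_flat(matrix)
```

Keeping the parameter loose means callers whose type checker cannot infer a shape for np.array([1, 2, 3]) are still accepted, while the return annotation carries as much information as the function can guarantee.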

cmp0xff

/pandas_nightly

cmp0xff · 2025-10-31T12:52:28Z

pandas-stubs/_libs/interval.pyi

-    @property
-    def is_monotonic_increasing(self) -> bool: ...
-    def clear_mapping(self) -> None: ...
+class IntervalTree(IntervalMixin): ...


class IntervalTree is implemented in pandas with Cython. I don't understand what's happening in the Cython code, but I would like to keep the class in the stubs, even though it's empty. Please tell me if you have a good argument for removing it completely.

cmp0xff · 2025-10-31T12:56:35Z

pandas-stubs/core/base.pyi

    | np_ndarray_float
    | np_ndarray_complex
-    | dict[str, np.ndarray]
+    | dict[str, np_1darray[Any]]


for function arguments, don't be restrictive on shape

This is relevant until support for Python 3.10 ends, because Python 3.10 can only use an old version of numpy that does not infer the correct shape on construction: with that version, np.array([1, 2, 3]) is not typed as a 1-d array by static type checkers.

would this one be better as Any shape?

Yep dc85974

Dr-Irv

A few main issues:

I think we should be consistent throughout the stubs and use the np_ndarray_xxx types that are now in _typing.pyi, rather than npt.NDArray[np.integer], etc.
pd.unique(someIndex) and Index.unique() are a bit of a mess, because the behavior changes from 2.3 to 3.0. See the release notes: https://pandas.pydata.org/docs/dev/whatsnew/v3.0.0.html#other . So what we need to do here is return the numpy types, as is done in 2.3, and put in a comment to switch to the Index types when 3.0 is released.
I don't think we should use npt.NDArray[np.double] in the stubs. We always use np.floating.
There are places where we can be more precise about which numpy array types are allowed as arguments.

I didn't necessarily pick up everything related to the main issues, and some of my comments are related to the issues above.

Dr-Irv · 2025-11-04T15:02:32Z

pandas-stubs/_libs/interval.pyi

-    @property
-    def is_monotonic_increasing(self) -> bool: ...
-    def clear_mapping(self) -> None: ...
+class IntervalTree(IntervalMixin): ...


It's not a documented class, so we should remove it from the stubs. And then change the references in _MidDescriptor and _LengthDescriptor to be IntervalMixin

Dr-Irv · 2025-11-04T15:10:21Z

pandas-stubs/core/arrays/datetimelike.pyi

    def __iter__(self): ...
    @property
-    def asi8(self) -> np.ndarray: ...
+    def asi8(self) -> np_1darray[Any]: ...


The result of this will be np_ndarray_int64.

Dr-Irv · 2025-11-04T15:12:26Z

pandas-stubs/core/groupby/groupby.pyi

        ignore_na: bool = ...,
        axis: Axis = ...,
-        times: str | np.ndarray | Series | np.timedelta64 | None = ...,
+        times: str | npt.NDArray[Any] | Series | np.timedelta64 | None = ...,


Dr-Irv · 2025-11-04T15:18:41Z

pandas-stubs/core/groupby/grouper.pyi

+    @cache_readonly
+    def groups(self) -> PrettyDict[Hashable, Index]: ...


Not sure why you added this. I think we can delete the class Grouping, because it is only used by the class BaseGrouper, which is only used by the class BinGrouper, which is never referenced.

None of those are documented.

Dr-Irv · 2025-11-04T15:21:38Z

pandas-stubs/core/indexers/objects.pyi

-    ) -> tuple[np.ndarray, np.ndarray]: ...
+    ) -> tuple[npt.NDArray[Any], npt.NDArray[Any]]: ...


I prefer the change suggested by @MarcoGorelli. See the docs here: https://pandas.pydata.org/docs/dev/user_guide/window.html#custom-window-rolling

The arrays have to be np_ndarray_int64

Dr-Irv · 2025-11-04T16:13:43Z

pandas-stubs/core/series.pyi

        ignore_na: _bool = False,
        axis: Axis = 0,
-        times: np.ndarray | Series | None = None,
+        times: npt.NDArray[Any] | Series | None = None,


Suggested change

times: npt.NDArray[Any] | Series | None = None,

times: np_ndarray_dt | Series | None = None,

Dr-Irv · 2025-11-04T16:18:25Z

tests/test_pandas.py

    )
    check(
-        assert_type(pd.unique(pd.RangeIndex(0, 10)), np.ndarray),
+        assert_type(pd.unique(pd.RangeIndex(0, 10)), np_1darray[Any] | pd.Index),


I don't think this can be a numpy array.

Dr-Irv · 2025-11-04T16:18:49Z

tests/test_pandas.py

        assert_type(
            pd.unique(pd.timedelta_range(start="1 day", periods=4)),
-            np.ndarray,
+            np_1darray[Any] | pd.Index,


UGH - looks like the type depends on the version of pandas.

So in the stubs we should use the numpy array as the return type, add a comment that it will switch to the Index type with 3.0, and make the change then.

I think this is true for many of the pd.unique() overloads.

Dr-Irv · 2025-11-04T16:24:01Z

tests/test_pandas.py

+        assert_type(pd.unique(dti), np_1darray[np.datetime64] | pd.DatetimeIndex),
+        np_1darray if PD_LTE_23 else pd.DatetimeIndex,
+        np.datetime64,


Same issue here with the change from 2.3 to 3.0. So the stubs should return the 2.3 types, and we'll change them when 3.0 is released.

Dr-Irv · 2025-11-04T16:24:56Z

tests/test_pandas.py

-    check(assert_type(d1, npt.NDArray), np.ndarray)
-    check(assert_type(e0, npt.NDArray[np.intp]), np.ndarray)
-    check(assert_type(e1, npt.NDArray), np.ndarray)
+    check(assert_type(d1, np_1darray[np.double]), np_1darray[np.double])


I don't think we should use np_1darray[np.double] in our stubs. Just use np_ndarray_float.

Same in the tests below.
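
As a small illustration of why the broader float type is preferred, the sketch below assumes an alias built on np.floating. FloatArray is a hypothetical stand-in for np_ndarray_float (whose exact definition in _typing.pyi is not shown in this thread), and halve is an illustrative function only.

```python
import numpy as np
import numpy.typing as npt

# Hypothetical stand-in for the stubs' float array alias.
FloatArray = npt.NDArray[np.floating]


def halve(values: FloatArray) -> FloatArray:
    """Works for any floating dtype, not just 64-bit doubles."""
    return values / 2


doubles = halve(np.array([1.0, 2.0], dtype=np.double))
singles = halve(np.array([1.0, 2.0], dtype=np.float32))

# np.double is one concrete member of the np.floating family.
double_is_floating = issubclass(np.double, np.floating)
```

An annotation written with np.double would describe only the 64-bit case, whereas one written with np.floating also covers np.float32 and the other floating dtypes.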

cmp0xff added 2 commits October 30, 2025 22:23

reduce ndarray

d0477e4

py310 happiness

b88da43

MarcoGorelli reviewed Oct 31, 2025

https://github.com/pandas-dev/pandas-stubs/pull/1462/files#r2480913506

0023b02

cmp0xff commented Oct 31, 2025

Merge branch 'main' of github.com:pandas-dev/pandas-stubs into featur…

8c1a120

…e/reduce-ndarray

MarcoGorelli reviewed Oct 31, 2025

https://github.com/pandas-dev/pandas-stubs/pull/1462/files#r2481057032

dc85974

cmp0xff commented Oct 31, 2025

Merge branch 'main' of github.com:pandas-dev/pandas-stubs into featur…

1cfd06e

…e/reduce-ndarray

cmp0xff requested review from Dr-Irv and loicdiridollou November 4, 2025 12:43

cmp0xff added the Compat pandas objects compatability with Numpy or Python functions label Nov 4, 2025

Dr-Irv requested changes Nov 4, 2025

cmp0xff marked this pull request as draft November 4, 2025 22:33

    ) -> tuple[np.ndarray, np.ndarray]: ...
    ) -> tuple[npt.NDArray[Any], npt.NDArray[Any]]: ...

        times: str | npt.NDArray[Any] | Series | np.timedelta64 | None = ...,
        times: str | np_ndarray_dt | Series | np.timedelta64 | None = ...,

    @cache_readonly
    def groups(self) -> PrettyDict[Hashable, Index]: ...

        times: npt.NDArray[Any] | Series | None = None,
        times: np_ndarray_dt | Series | None = None,


3 participants